Discriminator-Enhanced Knowledge-Distillation Networks
Authors
Abstract
Query auto-completion (QAC) serves as a critical functionality in contemporary textual search systems, generating real-time query completion suggestions from the user's input prefix. Despite the prevalent use of language models (LMs) for QAC candidate generation, LM-based approaches frequently suffer from overcorrection during pair-wise loss training and from efficiency deficiencies. To address these challenges, this paper presents a novel framework, discriminator-enhanced knowledge distillation (Dis-KD), for the QAC task. The framework combines three core components: a large-scale pre-trained teacher model, a lightweight student model, and a discriminator for adversarial learning. Specifically, the discriminator aids in discerning generative-level differences between the teacher and student models. An additional discriminator score is amalgamated with the traditional knowledge-distillation loss, resulting in enhanced performance of the student model. Contrary to stepwise evaluation of each generated word, our approach assesses the entire generated sequence, which alleviates the overcorrection issue in the distillation process. Consequently, the proposed framework achieves improvements in model accuracy alongside a reduction in parameter size. Empirical results highlight the superiority of Dis-KD over established baseline methods on QAC tasks for sub-word languages.
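The abstract describes combining a standard knowledge-distillation loss with a sequence-level discriminator score. A minimal sketch of such a combined objective, assuming the loss weights (`alpha`, `beta`), the temperature `T`, and the function name `dis_kd_loss` are all illustrative choices not taken from the paper:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def dis_kd_loss(student_logits, teacher_logits, disc_score,
                T=2.0, alpha=0.5, beta=0.1):
    """Hypothetical Dis-KD-style objective:
    - soft-label KD term: KL(teacher || student) at temperature T;
    - adversarial term: the student is rewarded when the discriminator
      rates its whole generated sequence as teacher-like.
    disc_score is the discriminator's probability (0..1) for the
    student's full output sequence, not a per-token score."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    # KL divergence, scaled by T^2 as is conventional in distillation
    kd = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)),
                axis=-1).mean() * T * T
    adv = -np.log(disc_score + 1e-12)  # student wants the discriminator fooled
    return alpha * kd + beta * adv
```

When the student matches the teacher exactly and the discriminator is fully fooled, both terms vanish; any divergence in the logits or a low discriminator score raises the loss.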
Similar resources
Entrainer-enhanced Reactive Distillation
The paper presents the use of a Mass Separation Agent (entrainer) in reactive distillation processes. This can help to overcome limitations due to distillation boundaries and, at the same time, increase the degrees of freedom in design. The catalytic esterification of fatty acids with light alcohols C2-C4 is studied as an application example. Because the alcohol and water distillate in top simultan...
Data-Free Knowledge Distillation for Deep Neural Networks
Recent advances in model compression have provided procedures for compressing large neural networks to a fraction of their original size while retaining most if not all of their accuracy. However, all of these approaches rely on access to the original training set, which might not always be possible if the network to be compressed was trained on a very large dataset, or on a dataset whose relea...
Learning Loss for Knowledge Distillation with Conditional Adversarial Networks
There is an increasing interest in accelerating neural networks for real-time applications. We study the student-teacher strategy, in which a small and fast student network is trained with the auxiliary information provided by a large and accurate teacher network. We use conditional adversarial networks to learn the loss function to transfer knowledge from teacher to student. The proposed method...
Sequence-Level Knowledge Distillation
Neural machine translation (NMT) offers a novel alternative formulation of translation that is potentially simpler than statistical approaches. However, to reach competitive performance, NMT models need to be exceedingly large. In this paper we consider applying knowledge distillation approaches (Bucila et al., 2006; Hinton et al., 2015) that have proven successful for reducing the size of neura...
Topic Distillation with Knowledge Agents
This is the second year that our group participates in TREC’s Web track. Our experiments focused on the Topic distillation task. Our main goal was to experiment with the Knowledge Agent (KA) technology [1], previously developed at our Lab, for this particular task. The knowledge agent approach was designed to enhance Web search results by utilizing domain knowledge. We first describe the generi...
Journal
Journal title: Applied Sciences
Year: 2023
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app13148041